 competitive environment


Proximal Learning With Opponent-Learning Awareness

Neural Information Processing Systems

Learning With Opponent-Learning Awareness (LOLA) (Foerster et al. [2018a]) is a multi-agent reinforcement learning algorithm that typically learns reciprocity-based cooperation in partially competitive environments. However, LOLA often fails to learn such behaviour on more complex policy spaces parameterized by neural networks, partly because the update rule is sensitive to the policy parameterization. This problem is especially pronounced in the opponent modeling setting, where the opponent's policy is unknown and must be inferred from observations; in such settings, LOLA is ill-specified because behaviorally equivalent opponent policies can result in non-equivalent updates. To address this shortcoming, we reinterpret LOLA as approximating a proximal operator, and then derive a new algorithm, proximal LOLA (POLA), which uses the proximal formulation directly. Unlike LOLA, the POLA updates are parameterization invariant, in the sense that when the proximal objective has a unique optimum, behaviorally equivalent policies result in behaviorally equivalent updates. We then present practical approximations to the ideal POLA update, which we evaluate in several partially competitive environments with function approximation and opponent modeling. This empirically demonstrates that POLA achieves reciprocity-based cooperation more reliably than LOLA.
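The parameterization sensitivity described in this abstract can be illustrated with a toy example. The sketch below is not the LOLA or POLA algorithm from the paper: it assumes a single agent, a hypothetical one-parameter policy written two behaviorally equivalent ways (`policy_a` and `policy_b`), a made-up objective that simply rewards cooperation probability, and a squared-distance penalty in policy space. It only shows that a plain gradient step depends on the parameterization, while a proximal step defined over behavior does not.

```python
import math


def sigmoid(x):
    return 1.0 / (1.0 + math.exp(-x))


# Two hypothetical parameterizations of the same one-number policy
# (probability of cooperating). theta = 1.0 and phi = 0.5 are equivalent.
def policy_a(theta):
    return sigmoid(theta)


def policy_b(phi):
    return sigmoid(2.0 * phi)


def objective(p):
    # Assumed toy objective: reward equals cooperation probability.
    return p


def grad(f, x, eps=1e-6):
    # Central finite-difference derivative of a scalar function.
    return (f(x + eps) - f(x - eps)) / (2.0 * eps)


def naive_step(policy, x, lr=1.0):
    # One plain gradient-ascent step in parameter space.
    return x + lr * grad(lambda z: objective(policy(z)), x)


def proximal_step(policy, x, eta=0.1, inner_lr=0.5, inner_steps=5000):
    # Approximately maximize objective(p') - (p' - p_old)^2 / (2 * eta),
    # where the penalty is measured on behavior, not on parameters.
    p_old = policy(x)

    def prox_objective(z):
        p = policy(z)
        return objective(p) - (p - p_old) ** 2 / (2.0 * eta)

    z = x
    for _ in range(inner_steps):
        z += inner_lr * grad(prox_objective, z)
    return z


naive_a = policy_a(naive_step(policy_a, 1.0))
naive_b = policy_b(naive_step(policy_b, 0.5))
prox_a = policy_a(proximal_step(policy_a, 1.0))
prox_b = policy_b(proximal_step(policy_b, 0.5))

print("naive step:   ", naive_a, naive_b)  # behaviors differ
print("proximal step:", prox_a, prox_b)    # behaviors agree
```

In this toy setting the proximal objective has a unique optimum in policy space, so both parameterizations converge to the same behavior, which mirrors the uniqueness condition stated in the abstract.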


Crypto giant Tether CEO on cooperating with Trump administration: 'We've never been shady'

The Guardian

Paolo Ardoino, CEO of the cryptocurrency company Tether, was flying over Switzerland last week as he contemplated the changing regulatory landscape. Tether used to be at war with the establishment. Now it is the establishment. The crypto giant – tether is the most traded cryptocurrency in the world – has had a strange trip. Four years ago, banks were dropping Tether as a client, and regulators in New York had the company against the wall over questions about commingled client and corporate funds.


An Agent-based Model for Competitive Agents

arXiv.org Artificial Intelligence

Continuous-time Markov chains have been employed for decades to model a broad spectrum of stochastic systems, including queuing systems (e.g., [3]) and financial markets (e.g., [5, 7]). These models often represent agent behavior in interactive environments, where local and global interaction rules are used to simulate various physical processes (e.g., see [2, 6, 4] for examples). A key question in the analysis of these models is how to derive the transient or stationary probability distributions that capture the system's evolving dynamics or long-term behavior. In this paper, we develop a straightforward stochastic agent-based model for the analysis of agents displaying competitive behavior, striving to survive within a competitive environment. This model has applications across applied finance and social science (see [1]). For instance, in financial markets, firms compete to attract more customers and clients; job market participants frequently switch employers to better fulfill their financial needs; governments work to strengthen their economies, and so forth. In the subsequent section, we begin with a microscopic model where numerous groups or agents exist, each containing a finite number of subagents.
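As a small illustration of the stationary-distribution question raised in this abstract, the sketch below computes the stationary distribution of a finite birth-death continuous-time Markov chain using detailed balance. It is a generic textbook construction, not the model from the paper; the function name `stationary_birth_death` and the example rates are assumptions chosen for illustration.

```python
def stationary_birth_death(birth, death):
    """Stationary distribution of a finite birth-death chain on states 0..n.

    birth[i] is the rate of jumping from state i to i + 1.
    death[i] is the rate of jumping from state i + 1 to i.
    Detailed balance gives pi[i + 1] = pi[i] * birth[i] / death[i].
    """
    weights = [1.0]
    for b, d in zip(birth, death):
        weights.append(weights[-1] * b / d)
    total = sum(weights)
    return [w / total for w in weights]


# Hypothetical three-state chain: growth at rate 1, decline at rate 2.
pi = stationary_birth_death([1.0, 1.0], [2.0, 2.0])
print(pi)
```

With these rates the unnormalized weights are 1, 1/2 and 1/4, so the chain spends most of its time in the lowest state.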


SUB-PLAY: Adversarial Policies against Partially Observed Multi-Agent Reinforcement Learning Systems

arXiv.org Artificial Intelligence

Recent advances in multi-agent reinforcement learning (MARL) have opened up vast application prospects, including swarm control of drones, collaborative manipulation by robotic arms, and multi-target encirclement. However, potential security threats during MARL deployment need more attention and thorough investigation. Recent research reveals that an attacker can rapidly exploit the victim's vulnerabilities and generate adversarial policies, leading to the victim's failure in specific tasks, for example by reducing the winning rate of a superhuman-level Go AI to around 20%. These studies predominantly focus on two-player competitive environments, assuming attackers possess complete global state observation. In this study, we unveil, for the first time, the capability of attackers to generate adversarial policies even when restricted to partial observations of the victims in multi-agent competitive environments. Specifically, we propose a novel black-box attack (SUB-PLAY), which incorporates the concept of constructing multiple subgames to mitigate the impact of partial observability and suggests the sharing of transitions among subpolicies to improve the exploitative ability of attackers. Extensive evaluations demonstrate the effectiveness of SUB-PLAY under three typical partial observability limitations. Visualization results indicate that adversarial policies induce significantly different activations of the victims' policy networks. Furthermore, we evaluate three potential defenses aimed at exploring ways to mitigate security threats posed by adversarial policies, providing constructive recommendations for deploying MARL in competitive environments.


CompeteAI: Understanding the Competition Behaviors in Large Language Model-based Agents

arXiv.org Artificial Intelligence

Large language models (LLMs) have been widely used as agents to complete different tasks, such as personal assistance or event planning. While most work has focused on cooperation and collaboration between agents, little work explores competition, another important mechanism that fosters the development of society and economy. In this paper, we seek to examine the competition behaviors in LLM-based agents. We first propose a general framework to study the competition between agents. Then, we implement a practical competitive environment using GPT-4 to simulate a virtual town with two types of agents: restaurant agents and customer agents. Specifically, restaurant agents compete with each other to attract more customers, and the competition pushes them to adapt, for example by developing new operating strategies. The results of our experiments reveal several interesting findings, ranging from social learning to the Matthew effect, which align well with existing sociological and economic theories. We believe that competition between agents deserves further investigation to help us understand society better. The code will be released soon.


North America Artificial Intelligence (AI) in Operating Room Market Dynamics, Segments and Trends in the 2022-2031 - Muleskinner

#artificialintelligence

New York (US) – Key Companies Covered in the North America Artificial Intelligence (AI) in Operating Room Research are Activ Surgical Inc., Brainomix Limited, Caresyntax Corp, DeepOR S.A.S, ExplORer Surgical Corp., Hanson Meditec Co., Ltd., Holo Surgical Inc., LeanTaaS Inc., Medtronic Plc, Proximie, Scalpel Limited, Theator Inc. and other key market players. The North America operating room AI market is projected to grow by 41.2% annually in the forecast period and reach $1,555.6 million by 2031, driven by growing funding for Artificial Intelligence (AI), advances in robotics and medical visualization technologies, and the benefits of artificial intelligence-enabled surgeries over conventional surgeries. Highlighted with 26 tables and 52 figures, this 104-page report, Artificial Intelligence (AI) in Operating Room: North America Market 2021-2031 by Offering (Hardware, Software), Technology (ML, DL, NLP, Others), Application (Training, Diagnosis, Analysis, Planning & Rehabilitation), Indication (Gastroenterology, Neurology, Urology, Orthopedics, Cardiology), End User (Hospitals, ASCs, Specialized Facilities), and Country, is based on comprehensive research of the entire North America operating room AI market and all its sub-segments through extensively detailed classifications. Profound analysis and assessment are generated from premium primary and secondary information sources with inputs derived from industry professionals across the value chain. The report is based on studies of 2018-2021 and provides a forecast from 2022 to 2031, with 2021 as the base year.


On limitations of learning algorithms in competitive environments

arXiv.org Artificial Intelligence

Playing human games such as chess and Go has long been considered to be a major benchmark of human capabilities. Computer programs have become robust chess players and, since the late 1990s, have been able to beat even the best human chess champions; though, for a long time, computers were unable to beat expert Go players -- the game of Go has proven to be especially difficult for computers. However, in 2016, a new program called AlphaGo finally won a victory over a human Go champion, only to be beaten by its subsequent versions (AlphaGo Zero and AlphaZero). AlphaZero proceeded to beat the best computers and humans in chess, shogi and Go, including all its predecessors from the Alpha family [1]. Core to AlphaZero's success is its use of a deep neural network, trained through reinforcement learning, as a powerful heuristic to guide a tree search algorithm (specifically Monte Carlo Tree Search). The recent successes of machine learning are good reason to consider the limitations of learning algorithms and, in a broader sense, the limitations of AI. In the context of a particular competition (or 'game'), a natural question to ask is whether an absolute winner AI might exist -- one that, given sufficient resources, will always achieve the best possible outcome.


Is Nigeria's Compliance Industry Ready for Challenges of Regulatory Technology? - THISDAYLIVE

#artificialintelligence

Today's customers demand more options, more creative solutions, greater flexibility and faster responses from banks and other financial institutions. Survival and success for financial institutions in this new world requires that they operate with intelligence, agility and speed to keep up with evolving customer preferences and technologies. Consequently, more and more customer interactions and financial transactions are going digital as online and mobile payments, customer on-boarding and account opening are on the rise. Yet, while digital interfaces present an opening for innovative business services, they also yield new challenges, such as pressure on back office operations or increased regulatory scrutiny. Largely automated interactions generate more data to analyse, demand higher volumes of sample testing, and expand the compliance burden. To create a flawless customer experience, the back office has to keep up as well.